AITopics | sparse and structured neural attention

A Regularized Framework for Sparse and Structured Neural Attention

Neural Information Processing Systems, Nov-21-2025, 14:42:58 GMT

Modern neural networks are often augmented with an attention mechanism, which tells the network where to focus within the input. In this paper, we propose a new framework for sparse and structured attention, building upon a smoothed max operator. We show that the gradient of this operator defines a mapping from real values to probabilities, suitable as an attention mechanism. Our framework includes softmax and a slight generalization of the recently proposed sparsemax as special cases. We also show how our framework can incorporate modern structured penalties, resulting in more interpretable attention mechanisms that focus on entire segments or groups of an input. We derive efficient algorithms to compute the forward and backward passes of our attention mechanisms, enabling their use in a neural network trained with backpropagation. To showcase their potential as a drop-in replacement for existing mechanisms, we evaluate our attention mechanisms on three large-scale tasks: textual entailment, machine translation, and sentence summarization. Our attention mechanisms improve interpretability without sacrificing performance; notably, on textual entailment and summarization, we outperform the standard attention mechanisms based on softmax and sparsemax.
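To make the contrast between the two special cases named in the abstract concrete, here is a minimal pure-Python sketch of softmax and sparsemax as mappings from real-valued scores to probabilities. It is an illustration, not the authors' implementation: the function names are hypothetical, the sparsemax shown is the standard sort-and-threshold projection onto the probability simplex (without the generalization mentioned in the abstract), and the structured variants are not covered.

```python
import math


def softmax(scores):
    """Dense attention weights: every position gets a nonzero probability."""
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sparsemax(scores):
    """Euclidean projection of `scores` onto the probability simplex.

    Low-scoring positions receive exactly zero weight.
    """
    ordered = sorted(scores, reverse=True)
    running = 0.0
    support_size = 0
    support_sum = 0.0
    for j, value in enumerate(ordered, start=1):
        running += value
        # Position j stays in the support while this condition holds.
        if 1.0 + j * value > running:
            support_size = j
            support_sum = running
    # Threshold chosen so that the clipped scores sum to one.
    tau = (support_sum - 1.0) / support_size
    return [max(s - tau, 0.0) for s in scores]


if __name__ == "__main__":
    scores = [1.5, 1.0, 0.1]  # hypothetical attention scores
    print(softmax(scores))    # all three weights are strictly positive
    print(sparsemax(scores))  # [0.75, 0.25, 0.0]
```

On the same scores, softmax spreads some weight over every position, while sparsemax assigns exactly zero to the lowest-scoring one, which is the kind of sparsity the abstract associates with interpretability.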

attention mechanism, regularized framework, sparse and structured neural attention, (7 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.84)

Reviews: A Regularized Framework for Sparse and Structured Neural Attention

Neural Information Processing Systems, Oct-7-2024, 16:30:59 GMT

Summary: This paper presents a framework for implementing different sparse attention mechanisms by regularizing the max operator with convex functions. Softmax and sparsemax are derived as special cases of this framework. In addition, two new sparse attention mechanisms are introduced that allow the model to learn to pay equal attention to contiguous spans. My concerns are about the motivation for interpretability and the choice of baseline attention models. However, the paper is very well presented, and the framework is a notable contribution that I believe will be useful for researchers working with attention mechanisms.
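As a rough sketch of what "regularizing the max operator" means, the construction can be written as follows. The notation ($\Delta^d$ for the probability simplex, $\Omega$ for the convex regularizer, $\gamma > 0$ for its strength) is assumed here for illustration and does not appear in the review text itself.

```latex
% Smoothed max operator over the probability simplex \Delta^d,
% with a strongly convex regularizer \Omega and strength \gamma > 0:
\max\nolimits_{\Omega}(x) = \sup_{y \in \Delta^d} \; y^\top x - \gamma\,\Omega(y)

% Its gradient is the unique maximizer, a probability vector
% that can serve as attention weights:
\Pi_{\Omega}(x) = \nabla \max\nolimits_{\Omega}(x)
               = \operatorname*{arg\,max}_{y \in \Delta^d} \; y^\top x - \gamma\,\Omega(y)

% Special cases:
%   \Omega(y) = \sum_i y_i \log y_i        (negative entropy)  gives softmax
%   \Omega(y) = \tfrac{1}{2}\|y\|_2^2      (squared 2-norm)    gives sparsemax
```

With $\gamma = 1$, the negative-entropy choice gives the usual softmax and the squared-norm choice gives the Euclidean projection onto the simplex, which is sparsemax.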

attention mechanism, mechanism, sparse and structured neural attention, (10 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.43)

A Regularized Framework for Sparse and Structured Neural Attention

Vlad Niculae, Mathieu Blondel

Neural Information Processing Systems, Feb-14-2020, 12:58:10 GMT

attention mechanism, regularized framework, sparse and structured neural attention, (4 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.88)
